Field of the invention

The present invention concerns generally the communication between two or
more processors. In particular, the present invention concerns the inter-processor communication
between processors that are arranged on the same semiconductor die.

Background of the invention

As the demand for more powerful computing devices increases, more and more
systems are offered that comprise more than just one processor.

For the purposes of the present invention, a distinction is to be made between
computer systems that comprise two or more discrete processors and systems where two or more
processors are integrated on the same chip. A computer with a main central processing unit
(CPU) on a motherboard and an algorithmic processor on a graphics card is an example of a
computer system with two discrete processors. Another example of a computer system with
several discrete processors is a parallel computer where an array of processors is arranged such
that an improved performance is achieved. For the sake of simplicity, systems on a board with two or
more discrete processors are also considered to belong to the same category.

There are systems where two or more processors are integrated on the same chip 
or semiconductor die. A typical example is a SmartCard (also referred to as integrated circuit 
card) that has a main processor and a crypto-processor on the same semiconductor die. 

As small handheld devices are becoming more and more popular, the demand
for powerful and flexible chips is increasing. A typical example is the cellular phone, which in
the beginning of its dissemination was just a telephone for voice transmission (analogue
communication). Over the years additional features have been added and most of today's cellular
phones are designed for voice and data services. Additional differentiators are wireless
application protocol (WAP) support, short message service (SMS), and multimedia messaging
service (MMS) functionality, just to name some of the more recent developments. All these
features require more powerful processors and quite often even dual-processor or multi-processor
chips.

In the future, systems handling digital video streams, for example, will become
available. These systems also require powerful and flexible chip sets.

Other examples are integrated circuit cards, such as multi-purpose JavaCards,
small handheld devices, such as palm top computers or personal digital assistants (PDAs), video
and audio devices, devices for automotive use, and so forth.

It is essential for such dual-processor or multi-processor chips that there exists a
communication channel for efficient inter-processor communication. The expression
"inter-processor communication" is herein used as a synonym for any communication between a first
processor and/or system resources associated with this first processor and a second processor
and/or system resources associated with this second processor. A shared memory (e.g., a random
access memory) is an example of a system resource that usually needs to be accessible by all
processors of a chip.

System resources have to be shared in an efficient manner in dual-processor or
multi-processor chips where the processors operate in parallel on the same aspect of a task or on
different aspects of the same task. The sharing of resources may also be necessary in applications
where processors are called upon to process related data.

An example of a multi-processor system is given in the European Patent
application EP 0 580 961-A1, filed on 16 April 1993. This Patent application concerns a system
with multiple discrete processors and a global bus that is shared by all these processors.
Enhanced processor interfaces are provided for linking the processors to the common bus. Such
multi-processor systems with a global bus cannot be realized using RISC processors, due to the
high bus load which would have an impact on the system's performance. The multi-processor
system presented in EP 0 580 961-A1 is powerful but complicated and expensive to implement.
The structure shown cannot be used in multi-processor systems on a common die.

Another system is proposed in US Patent US 4,866,597, filed on 26 April 1985.
This US Patent concerns a multi-processor system where each processor has its own processor
bus. Data are exchanged between the processors via first-in-first-out data buffers (FIFO) which
directly interconnect the respective processor buses. It is a disadvantage of this approach that the
size of the buffers increases dramatically with the amount of data to be transferred.

US Patent 5,093,780 concerns an inter-processor transmission system that has a
data link which automatically reads and writes transfer data. A direct memory access (DMA) unit
and a transmitter are assigned to a first processor, and a receiver together with a DMA unit is
assigned to a second processor. The processor has to set up the transfer by programming the
corresponding DMA. That is, the processor has to know upfront whether data are to be
transferred. This is a disadvantage of the described inter-processor transmission system, since the
respective processor needs to be involved. Another disadvantage of the said system is the fact
that the whole transmission is mono-directional, i.e., the implementation is asymmetric. It is only
possible to transfer data from the memory 16 on the left hand side of Figure 4 of that patent to the
memory 26 on the right hand side.

A DMA controller for a multi-microcomputer system is disclosed in US patent
5,222,227. The DMA controller has the function of controlling data transfer operations that are
executed by the microcomputer systems. Separate address and data pipelines are provided.
Tri-State-Technology is used for the buses. The buses CDB and SDB are at least temporarily
electrically interconnected. As a consequence, both buses have to be operated at the same clock
speed and both buses have to have the same bus width. According to the US patent 5,222,227,
only homogeneous buses can be interconnected. There is no external DMA channel used in the
system presented.

A multi-processor system with a shared memory is described and claimed in US
Patent US 5,283,903, filed on 17 September 1991. The system in accordance with this US Patent
comprises a plurality of processors, a shared memory (main memory), and a priority selector
unit. The priority selector unit arbitrates between those processors that request access to the
shared memory. This is necessary, since the shared memory is a single-port memory (e.g., a
random access memory) that cannot handle simultaneous and competing requests from several
processors. It is a disadvantage of this approach that the shared memory is expensive when it
serves only as intermediate storage. The shared memory can also become large when the volume
of data to be transferred is high.

Another multi-processor system is described in US Patent US 5,289,588, filed
on 24 April 1990. The processors are coupled by a common bus. They can access a shared
memory via this common bus. A cache is associated with each processor and an arbitration
scheme is employed to control the access to the shared memory. It is a disadvantage of this
approach that the cache memory is expensive, as only big caches give a real performance boost.
In addition, bus conflicts lead to a reduced performance of each processor.

A microprocessor architecture is described in the PCT Patent application
PCT/JP92/00869, filed on 7 July 1992, and published under PCT Publication number WO
93/01553. The architecture supports multiple heterogeneous processors which are coupled by
data, address, and control signal buses. Access to a memory is controlled by arbitration circuits.

Some of the known multi-processor systems use architectures where the
inter-processor communication occupies part of the processor's processing cycles. It is desirable to
avoid this overhead and to free up the processor's processing power in order to be able to better
exploit the processor's capabilities and performance.

Other known schemes cannot be used for integrated multi-processor systems
where two or more processors are located within the same chip.

It is yet another disadvantage of some known systems that they are asymmetric
in their implementation, which means that different implementations are required for each
processor. Furthermore, the effort for formal verification is greater for asymmetric than for
symmetric implementations.

Summary of the Invention

It is an object of the present invention to provide a scheme for efficient data
transfer between two or more processors and/or their associated components.

It is an object of the present invention to provide an inter-processor data transfer
scheme that is suited for the integration into a semiconductor die.

These and other objectives are achieved by the present invention which provides
a system that comprises at least two integrated processors. According to the present invention,
these two processors are operably connected via a communication channel for exchanging
information. One processor (P1) has a processor bus, a shareable unit, and a DMA unit with two
external DMA channels. The DMA unit and the shareable unit are connected to the processor
bus. The other processor also has a shareable unit and a DMA unit with two external DMA
channels. Programmable units are employed enabling the processors to set up the desired
communication links. Due to this arrangement, two bi-directional communication channels are
establishable between the two bus regimes.

The two or more processors can be arranged on a common semiconductor die.
This allows computing devices such as PDAs, handheld computers, palm top
computers, cellular phones, and cordless phones to be realised, for example.

The communication channel can be used advantageously for communication
between two or more processors and/or their associated components. The inventive arrangement
suits general multi-core communication needs. The arrangement is highly symmetrical and it
minimises the number of bus masters that would otherwise be needed for each processor. The present
scheme is expandable and very flexible.

These and other aspects of the invention will be apparent from and elucidated
with reference to the embodiment(s) described hereinafter.

Brief description of the drawings

For a more complete description of the present invention and for further objects
and advantages thereof, reference is made to the following description, taken in conjunction with
the accompanying drawings, in which:

FIG. 1 is a schematic block diagram of a dual-processor computer system,
according to a first embodiment of the present invention.

FIG. 2 is a schematic illustration of an inter-processor communication system
according to the present invention.

FIG. 3 is a detailed block diagram of the inter-processor communication
system of Figure 2.

FIG. 4 is a schematic block diagram of a dual-processor computer system,
according to another embodiment of the present invention.

FIG. 5 is a schematic block diagram of a dual-processor computer system,
according to another embodiment of the present invention.

FIG. 6 is a detailed block diagram of the DTU unit of Figure 5.

DESCRIPTION OF PREFERRED EMBODIMENTS
The present invention is described in connection with several embodiments.

As shown in Figure 1, a dual-processor system to which the present invention is
applied comprises a first processor P1 that is connected via a first processor bus 10 to a first
shareable unit 13. A processor bus (also called microprocessor bus) is the main path connecting
to the computer system's processor. An example of a shareable unit 13 or 23 is a shared memory
(e.g., a random access memory; RAM). The first processor bus 10 is a 64 bit, 20MHz bus. The
system comprises a second processor P2 that also has a processor bus 20. This second processor
bus 20 is a 64 bit, 66MHz bus. An interconnection between the two processor environments 18
and 28 (schematically illustrated by ovals in Figure 1) is established via two bi-directional
communication channels 11 and 21. The first bi-directional channel 11 is programmable by the
processor P1, as indicated by the arrow 12, and the second channel 21 is programmable by the
processor P2, as indicated by the arrow 22. The two bi-directional channels 11 and 21 are
hereinafter referred to as intercore communication system 9.
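
To make the mismatch between the two bus regimes concrete, the following minimal sketch computes an upper bound on the throughput of each processor bus from the figures given above. It assumes, purely for illustration, that each bus completes one full-width transfer per clock cycle; the description does not state this, and the function name is hypothetical.

```python
def peak_bandwidth_bytes_per_s(width_bits, clock_hz):
    """Upper bound on bus throughput.

    Assumption (not from the description): one full-width transfer
    completes on every clock cycle.
    """
    return (width_bits // 8) * clock_hz


# Figures taken from the first embodiment: both buses are 64 bits wide.
bus10 = peak_bandwidth_bytes_per_s(64, 20_000_000)  # first processor bus 10
bus20 = peak_bandwidth_bytes_per_s(64, 66_000_000)  # second processor bus 20

print(bus10, bus20)
```

Under this assumption the second bus can move data several times faster than the first, which is why the two bus regimes cannot simply be wired together and a decoupling element is placed between them.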

More details of the first embodiment are depicted in Figure 2. The intercore
communication system 29 comprises a first DMA unit 45 (DMA1) with a first and a second
external DMA channel 46, 47. The first DMA unit 45 is connectable to the first processor bus 10
via an internal DMA channel 49. It furthermore comprises a first double tandem unit (DTU) 34
(DTU1) which is connectable via the first external DMA channel 47 to the first DMA unit 45.
The DTU unit 34 is programmable by the first processor P1, as indicated by the arrow 32, and
the first DMA unit 45 is programmable by the first processor P1, as indicated by the arrow 132.
In addition, the intercore communication system 29 comprises a second DMA unit 35 (DMA2)
and a second DTU unit 44 (DTU2). The DMA unit 35 has a first and a second external DMA
channel 36, 37, and an internal DMA channel 39. The second DMA unit 35 is connected via the
internal DMA channel 39 to the second processor bus 20. The second DMA unit 35 and the
second DTU unit 44 are connectable via the first external DMA channel 37. The DTU unit 44 is
programmable by the second processor P2, as indicated by the arrow 42, and the second DMA
unit 35 is programmable by the second processor P2, as indicated by the arrow 142. A first
bi-directional communication channel is implemented by the first DTU 34 and a second
bi-directional channel is implemented by the second DTU 44. Each DTU 34, 44 is directly
connectable to the processor for programming/configuration purposes, as illustrated in Figure 2.
An interconnection between the two processor environments 38 and 48 (schematically illustrated
by ovals in Figure 2) is established by a bi-directional data transfer.

It is a novel feature of the embodiment given in Figures 1 and 2 that the
programming of one DTU unit, e.g. DTU1, allows (bi-directional) data transfer from the
processor environment 48 to the processor environment 38 and vice versa without the
programming of any other resource. Data can be moved to the DMA1 and fetched from the
shareable unit 23 that is attached to the second processor bus 20. The DTU2 is able to move data
to the DMA2 and to fetch data from the shareable unit 13 that is attached to the first processor
bus 10.
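
The behaviour described above, where programming a single DTU suffices for transfers in both directions, can be illustrated with a small software model. This is only a sketch under stated assumptions: the class and method names (`ShareableUnit`, `DmaUnit`, `Dtu`, `fetch`, `push`) are hypothetical, the shareable units are modelled as byte arrays, and no timing, arbitration, or clocking is represented.

```python
class ShareableUnit:
    """Stand-in for a shareable unit such as a RAM on one processor bus."""

    def __init__(self, size):
        self.mem = bytearray(size)


class DmaUnit:
    """DMA unit on one processor bus. Its external channels let an external
    agent (a DTU) reach the local shareable unit without involving the
    local processor."""

    def __init__(self, local_unit):
        self.local_unit = local_unit

    def read(self, addr, length):
        if addr < 0 or addr + length > len(self.local_unit.mem):
            raise ValueError("read outside shareable unit")
        return bytes(self.local_unit.mem[addr:addr + length])

    def write(self, addr, data):
        if addr < 0 or addr + len(data) > len(self.local_unit.mem):
            raise ValueError("write outside shareable unit")
        self.local_unit.mem[addr:addr + len(data)] = data


class Dtu:
    """Double tandem unit programmed by ONE processor. It drives one external
    DMA channel on its own side and one on the remote side."""

    def __init__(self, own_dma, remote_dma):
        self.own_dma = own_dma
        self.remote_dma = remote_dma

    def fetch(self, remote_addr, own_addr, length):
        """Remote shareable unit -> own shareable unit."""
        self.own_dma.write(own_addr, self.remote_dma.read(remote_addr, length))

    def push(self, own_addr, remote_addr, length):
        """Own shareable unit -> remote shareable unit."""
        self.remote_dma.write(remote_addr, self.own_dma.read(own_addr, length))


# Hypothetical setup mirroring Figure 2: unit 13 on bus 10, unit 23 on bus 20.
unit13, unit23 = ShareableUnit(32), ShareableUnit(32)
dma1, dma2 = DmaUnit(unit13), DmaUnit(unit23)
dtu1 = Dtu(own_dma=dma1, remote_dma=dma2)  # programmed by P1 only

unit23.mem[0:4] = b"abcd"
dtu1.fetch(remote_addr=0, own_addr=8, length=4)   # environment 48 -> 38
unit13.mem[0:2] = b"hi"
dtu1.push(own_addr=0, remote_addr=16, length=2)   # environment 38 -> 48
```

In this model only `dtu1` is configured, yet data moves in both directions; a second DTU on the other side would give processor P2 the same capability.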

The dual-processor arrangement illustrated in Figures 1 or 2 allows the second
processor P2 to access the shareable unit 13. The shareable unit 23 is accessible by the processor
P1.

In more general terms, one processor (processor P2 in the present embodiment)
of a multi-processor system in accordance with the present invention is able to access resources
(the shareable unit 13 in the present embodiment) that are associated with another processor
(processor P1 in the present embodiment). A resource of another processor on a remote bus may
be accessed, for instance, to upload data to or download data from cheap remote memory. A processor may
for example access the memory of a co-processor to fetch data that were computed by the
co-processor. These are just two typical examples of situations where a first processor accesses
resources on a remote bus.

Various types of processors can be interconnected using the present scheme. It
makes it possible to realise chips with multiple homogeneous processors or even with multiple
heterogeneous processors. The word processor is herein used as a synonym for any processing
unit that can be integrated into a semiconductor chip and that actually executes instructions and
works with data.

Complex instruction set computing (CISC) is one of the two main types of
processor designs in use today. It is slowly losing popularity to reduced instruction set computing
(RISC) designs. The most popular current CISC processor is the x86, but there are also 68xx,
65xx, and Z80s in use.

Currently, the fastest processors are RISC-based. There are several popular
RISC processors, including Alphas (developed by Digital and currently produced by
Digital/Compaq and Samsung), ARMs (developed by Advanced RISC Machines, currently
owned by Intel, and currently produced by both the above and Digital/Compaq), PA-RISCs
(developed by Hewlett-Packard), PowerPCs (developed in a collaborative effort between IBM,
Apple, and Motorola), and SPARCs (developed by Sun; the SPARC design is currently
produced by many different companies).

ARMs are different from most other processors in that they were not designed to
maximise performance but rather to maximise performance per power consumed. Thus ARMs
find most of their use in hand-held machines and PDAs.

In the above sections some examples were given of the processors that can be
interconnected in accordance with the present invention. Also suited are digital signal
processors (DSPs), the processor cores of any of the known processors, and customer specific
processor designs. In other words, the present concept is applicable to most microprocessor
architectures. One can even interconnect a processor with a slow processor bus and a processor
with a fast processor bus.

For the purpose of the present application, the following is also considered to be 
a processor: central processing unit (CPU), microprocessor, digital signal processor (DSP), 
system controller (SC), co-processor, auxiliary processor, control unit and so forth. 

A direct memory access (DMA) unit is a unit that is designed for passing data
from a memory to another device without passing it through the processor. A DMA typically has
one or more dedicated internal DMA channels and one or more dedicated external DMA
channels for external peripherals. Such an external DMA channel, contrary to an internal DMA
channel that is controlled by the processor to which it is associated, is set up by external agents
in order for the remote processor to get access to another processor's shareable unit. For
instance, a DMA allows devices on a processor bus to access memory without requiring
intervention by the processor.

Examples of shareable units are: volatile memory, non-volatile memory,
peripherals, interfaces, input devices, output devices, and so forth.

The intercore communication system, according to the present invention,
decouples the data flow between the clock domain of a first processor P1 and the clock domain
of a second processor P2. This means that within the limits of the inventive data transfer system,
the activity on one processor does not require simultaneous and equivalent activity on the other
processor.

Details of the intercore communication system 29, according to the present
invention, are described in connection with Figure 3. The intercore communication system 29
comprises a first DTU 34 (DTU1), a second DTU 44 (DTU2), a first DMA 45 (DMA1), and a
second DMA 35 (DMA2). In the present embodiment, the DMA unit 35 comprises two external
DMA channel units 56, 57. The internal channel 39 of these two external DMA channel units
56, 57 is connected to the processor bus 20.

The first external DMA channel unit 56 is connected via a link 36 to the second
DTU 44. The second external DMA channel unit 57 is connected via a link 37 to the first DTU
34. The first DMA unit 45 comprises two external DMA channel units 54, 55. The internal
channel 49 of these two external DMA channel units 54, 55 is connected to the processor bus 10.
The first external DMA channel unit 55 is connected via a link 47 to the first DTU 34. The
second external DMA channel unit 54 is connected via a link 46 to the second DTU 44. The
internal channel 49 of these two external DMA channel units 54, 55 is connected to the
processor bus 10.

The DTU 34 comprises a first processor interface 60 allowing a programming
link 52 to be established via the processor bus 10 to the processor P1 (not shown in Figure 3).
The DTU 34 further comprises a direct access unit core (DAU core) 62, and two external DMA
channel interfaces 61 and 63. The external DMA channel interface 61 serves as interface to the
external DMA channel unit 55 and the external DMA channel interface 63 serves as interface to
the external DMA channel unit 57.

The DTU 44 comprises a first processor interface 50 allowing a programming
link 51 to be established via the processor bus 20 to the processor P2 (not shown in Figure 3).
The DTU 44 further comprises a direct access unit core (DAU core) 52, and two external DMA
channel interfaces 51 and 53. The external DMA channel interface 51 serves as interface to the
external DMA channel unit 56 and the external DMA channel interface 53 serves as interface to
the external DMA channel unit 54.

The clock signal of the first processor P1 (clock1) is fed via a clock line 58 to
the following units: external DMA channel unit 54, external DMA channel unit 55, external
DMA channel interface 53, external DMA channel interface 61, and DAU core 62. The clock
signal of the second processor P2 (clock2) is fed via a clock line 59 to the following units:
external DMA channel unit 56, external DMA channel unit 57, external DMA channel interface
51, external DMA channel interface 63, and DAU core 52.

The processor P1 configures the first DTU 34 by means of the first processor
interface 60. The DAU core 62 of the DTU 34 is the control logic for the two external channel
interface units 61 and 63. The DAU core 62 furthermore performs the data transfers, ideally
enhanced by a first-in first-out (FIFO) buffer. In the same way, the processor P2 configures the second
DTU 44 via the second processor interface 50. In both cases the external channels of the first
DMA unit 45 use the resources of the internal DMA channel 49 on the processor bus 10, and the
external channels of the second DMA unit 35 use the resources of the internal DMA channel 39
on the processor bus 20.

As illustrated in Figure 3, the intercore communication system 29 provides for a
clock decoupling. All the blocks are either clocked by the clock1 of the processor P1 or by the
clock2 of the processor P2 such that the activity on one processor does not require simultaneous
and equivalent activity on the other processor.

In cases where there is no phase and/or frequency relationship between the
signals clock1 and clock2, the DAU cores 52, 62 can be implemented such that they are enabled
to provide safe data transfers by means of appropriate handshaking signals. These handshaking
signals are active between the DAU core 52 and the external DMA channel interface 53 as well
as between the DAU core 62 and the external DMA channel interface 63.
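
The description leaves the handshaking signals unspecified. As one possible illustration, the sketch below simulates a four-phase request/acknowledge handshake between two unrelated clocks. The protocol choice, the signal names `req` and `ack`, the function name, and the integer clock periods are all assumptions made for this example, and synchroniser latency and metastability are deliberately ignored.

```python
def simulate_handshake(words, period_src, period_dst, max_time=100_000):
    """Move `words` from a source clock domain to a destination clock domain.

    Each side acts only on its own clock edges (multiples of its period).
    Four-phase protocol (an assumption, not taken from the description):
      1. source presents data and raises req
      2. destination latches data and raises ack
      3. source sees ack and drops req
      4. destination sees req low and drops ack
    """
    req = ack = False
    data = None
    pending = list(words)
    received = []
    for t in range(max_time):
        if t % period_src == 0:          # edge in the source clock domain
            if not req and not ack and pending:
                data = pending.pop(0)    # data is stable before req rises
                req = True
            elif req and ack:
                req = False
        if t % period_dst == 0:          # edge in the destination clock domain
            if req and not ack:
                received.append(data)
                ack = True
            elif not req and ack:
                ack = False
        if not pending and not req and not ack:
            break                        # everything sent and handshake idle
    return received


# Periods in arbitrary time units; 50 and 15 roughly mirror 20MHz and 66MHz.
result = simulate_handshake([1, 2, 3, 4], period_src=50, period_dst=15)
```

Because each word is only released after the previous acknowledge has been withdrawn, no word is lost or duplicated regardless of which side is faster, at the cost of several clock edges per word. A FIFO in the DAU core would amortise that cost over a burst.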
The external DMA channel interfaces and/or the DAU cores can be
standardised. In other words, each DTU or DMA, according to the present invention, may
contain an identical functional core. Only the processor interface has to be adapted depending on
the actual processor and/or processor bus employed. This leads to a reduced development time,
because re-use is maximised, and to a reduced verification effort.

According to the present invention, a DMA unit is connected via its internal
interface to a processor bus and via its external interface to a DTU. The external interface may
be 8 bits wide.

The processor interface has a programming input (e.g. input 52 in Figure 3),
since this interface serves for the programming of the DTU in which it is comprised. The
processor interface does not require any data link to the processor bus, since any data exchanged
is handled by the DTU's external DMA channel interfaces. The setup and configuration of the
bi-directional channel is done by a processor by programming, via the processor interface, the
respective DTU's DAU core.

The DTU 34, for instance, makes use of the external DMA channel 47 in order to
transfer information (data and/or control information) to and from the shareable unit 13.

Another embodiment is illustrated in Figure 4. A system is illustrated that
comprises a first processor P1, a first processor bus 70, and a first shareable unit 76 being
connected to the first processor bus 70. There is a second processor P2, a second processor bus
80 and a second shareable unit 86 attached thereto. A first bi-directional communication channel
is establishable via a first DMA unit 83 with an external DMA channel 85. The first DMA unit
83 is connectable to the first processor bus 70. A first DTU unit 82 is provided. The first DTU
unit 82 is connectable via the external DMA channel 85 to the first DMA unit 83. The DTU unit
82 is programmable by the first processor P1. The programming takes place via the processor
bus 70 and a programming link 84. Furthermore, a master unit (master2) 81 is provided. This
master unit 81 serves as an interface between the first DTU 82 and the second processor bus 80.
A second bi-directional communication channel is establishable via a second DMA unit 73 with
an external DMA channel 75. The second DMA unit 73 is connectable to the second processor
bus 80. A second DTU unit 72 is provided. The second DTU unit 72 is connectable via the
external DMA channel 75 to the second DMA unit 73. The DTU unit 72 is programmable by the
second processor P2. The programming takes place via the processor bus 80 and a programming
link 74. Furthermore, a master unit (master1) 71 is provided. This master unit 71 serves as an
interface between the second DTU 72 and the first processor bus 70. A master in the present
context is a unit being able to initiate (and continue) data transfers on the processor buses. The
masters therefore need to have access to some kind of arbitration (prioritization) on the buses
(this prioritization is not part of the present patent application). The masters have an addressing
circuitry used to select the active device on the processor bus.

Another embodiment is illustrated in Figure 5. An intercore communication
system 99 is provided that makes it possible to establish two bi-directional channels between the two
processor buses 90 and 100. The processor P1 may be a digital signal processor (DSP) core, and
the processor P2 may be a system controller (SC) core, for example. In the present embodiment,
there is one common DTU unit 92 which for example comprises the functional elements of the
DTU1 and DTU2 of Figures 2, 3, or 4. One part of this common DTU 92 is programmable by
the processor P1, as indicated by the arrow 104, and the other part is programmable by the
processor P2, as indicated by the arrow 94. Details of this DTU 92 are depicted in Figure 6.

The common DTU 92 comprises a first processor interface 120 allowing a
programming link 104 to be established via the processor bus 90 to the processor P1 (not shown
in Figure 6). The DTU 92 further comprises a direct access unit core (DAU core) 122, and two
external DMA channel interfaces 121 and 123. The external DMA channel interface 121 serves
as interface to the DMA1 unit 101 and the external DMA channel interface 123 serves as
interface to the DMA2 unit 93. The DTU 92 further comprises a second processor interface 110
allowing a programming link 94 to be established via the processor bus 100 to the processor P2
(not shown in Figure 6). The DTU 92 further comprises another direct access unit core (DAU
core) 112, and two external DMA channel interfaces 111 and 113. The external DMA channel
interface 111 serves as interface to the DMA2 unit 93 and the external DMA channel interface
113 serves as interface to the DMA1 unit 101. How the two clock signals clock1 and clock2 are
applied is also shown in Figure 6.

The DTU 92 programming is preferably done using two separate register sets,
each register set being assigned to one processor, P1 or P2. This avoids conflicts between
simultaneous accesses performed by the two DAU cores 112 and 122. However, a prioritisation
scheme is required that makes it possible to prioritise requests from the processor P1 or requests from the
processor P2. The following two schemes are proposed:

- No priority is specified and the operation is based on the first come first serve
principle, i.e., the processor that comes first has priority over the other processor.
Ongoing transfers are always completed and new transfers are put on a waiting queue.

- The processor P1 has priority over the processor P2, or vice versa. An ongoing
transfer of low priority data can be interrupted by a request submitted by the processor
that has the higher priority. The interruption of the transfer happens transparently to the low
priority core. After the high priority request has finished, the low priority request is
resumed. A high priority transfer is never interrupted.
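
The two prioritisation schemes listed above can be sketched as a single word-by-word arbiter. This is an illustrative model only: the function name `arbitrate`, the request tuple layout, and the one-word-per-step granularity are assumptions introduced here, and every request is assumed to contain at least one word.

```python
from collections import deque


def arbitrate(requests, high_priority=None):
    """Return which processor owns the DTU at each transfer step.

    requests      -- iterable of (arrival_step, processor, n_words), n_words >= 1
    high_priority -- None selects first come first serve; a processor name
                     selects fixed priority with transparent interruption.
    """
    pending = deque(sorted(requests))
    queue = []       # active and waiting transfers as [processor, words_left]
    timeline = []    # processor served at each step, or None when idle
    step = 0
    while pending or queue:
        while pending and pending[0][0] <= step:
            _, proc, n_words = pending.popleft()
            entry = [proc, n_words]
            if high_priority is not None and proc == high_priority:
                # Goes ahead of every low priority transfer (interrupting an
                # ongoing one) but behind earlier high priority transfers,
                # so a high priority transfer is never interrupted.
                position = sum(1 for e in queue if e[0] == high_priority)
                queue.insert(position, entry)
            else:
                # Ongoing transfers complete; the new one waits its turn.
                queue.append(entry)
        if queue:
            queue[0][1] -= 1
            timeline.append(queue[0][0])
            if queue[0][1] == 0:
                queue.pop(0)
        else:
            timeline.append(None)
        step += 1
    return timeline


reqs = [(0, "P2", 4), (1, "P1", 2)]
fcfs = arbitrate(reqs)                        # first scheme
fixed = arbitrate(reqs, high_priority="P1")   # second scheme, P1 over P2
```

With these example requests, the first scheme lets P2 finish all four words before P1 starts, whereas the second scheme suspends P2 after its first word, serves both words of P1, and then resumes P2 with its remaining word count intact.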
According to the present invention, the DTU units make use of external DMA
channels to transfer data to/from the shareable unit that is connectable to the processor bus of the
other processor. Such an external DMA channel, contrary to the internal DMA channels which
are programmed by the respective processor, is set up by external agents in order to get access
to the resources of the other processor. The external agents in this patent application are the
commands programmed by a remote processor to have access to a resource on the local processor,
whereas the internal DMA channels are programmed by the local processor itself.

The present invention can also be employed in systems with more than two
processors. A third processor might be connected via its own processor bus, a third DMA3 unit
and a third DTU3 to the DMA2 unit of the second processor, for example. This would allow the
third processor to establish a bi-directional channel to resources that are associated with the
second processor.

In yet another embodiment of the invention, two or more processors and a
communication channel for inter-processor communication in accordance with the present
invention are integrated into a custom application specific integrated circuit (ASIC).

It is an advantage of the architecture presented and claimed herein that it
supports multiple heterogeneous processors. The inventive scheme can be expanded to suit
general multi-core communication needs. Due to the present invention, the number of bus
masters for each processor can be reduced, as potentially available DMA units can be used for
this purpose. The concept and design reuse is another advantage. Various other advantages
have been mentioned in connection with the various embodiments of the present invention.

The proposed architecture is symmetric and applicable to most microprocessor 
architectures. It can be expanded to multi-core architectures, i.e., it is independent of the number 
of cores. 

The present invention is well suited for use in computing devices, such as PDAs,
handheld computers, palm top computers, and so forth. It is also suited for being used in cellular
phones (e.g., GSM phones), cordless phones (e.g., DECT phones), and so forth. The architecture
proposed herein can be used in chips or chip sets for the above devices or chips for Bluetooth
applications.

It is appreciated that various features of the invention which are, for clarity,
described in the context of separate embodiments may also be provided in combination in a
single embodiment. Conversely, various features of the invention which are, for brevity,
described in the context of a single embodiment may also be provided separately or in any
suitable sub-combination.

In the drawings and specification there have been set forth preferred embodiments
of the invention and, although specific terms are used, the description thus given uses
terminology in a generic and descriptive sense only and not for purposes of limitation.



